Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detections] Adds framework for replacing API schemas #82462

Merged
merged 25 commits into from
Nov 13, 2020

Conversation

marshallmain
Copy link
Contributor

@marshallmain marshallmain commented Nov 3, 2020

Summary

Refactors the detection engine rules APIs to reduce duplication and provide more static checks of consistency between the API and the internal rule structure.

The create and update API schemas are built from a single source of truth (in request/rule_schemas.ts). The response schema can also be built from this source of truth, but the response schemas are not fully replaced yet to keep this PR more manageable.

The rule saved object data structure is represented by a separate source of truth schema (in schemas/rule_schemas.ts). From this structure we can build the create, update, and response schemas that the alertsClient uses. This structure also allows validation of internal structures before sending them to the alertsClient. This is useful for the patch route, where we need to merge the patch request with an existing object and then check if the merged object is valid.

Conversion functions (in schemas/rule_converters.ts) provide conversions from create REST API -> create internal rule structure, and from internal rule response -> REST API response. These conversion functions help ensure that the internal schema stays in sync with the API schema because they'll throw errors if properties are added to either the API or internal schema and not the other one.

Future work:

  • Reduce duplication in import, prepackaged rules, patch routes
  • Use stricter schema on API response as well (replace rules_schema.ts)
    • Possibly return warning instead of error if response fails to validate? Bad news if existing rules fail to validate in the response schema and they become inaccessible through the API
  • Use internal rule schema for validation of rules in detection engine server logic

Advantages of this approach:

  • Viewing fields that are valid for a specific rule type is simple: check the object for that rule type in request/rule_schemas.ts, e.g. threatMatchRuleParams - fields are listed as either required, optional, or defaultable
  • Autocompletion in VSCode also automatically restricts to valid fields once the type is set for a rule
  • Create, update, patch, and response API schemas can all be generated from the same objects in request/rule_schemas.ts
    • This should prevent a class of bugs I encountered when I forgot to add a field to the response schema, which then allowed the rule to be created even though the server returned 500 - and subsequently returned 500 when attempting to retrieve the rule page with the new rule
  • New io-ts schema for internal rule data structure enables stricter validation for patch route
    • the patch route needs to check that the parameters that are received in the API are compatible with the existing rule parameters, which means we need a schema to validate the internal data structure after applying the patch
    • schemas/rule_schemas.ts provides this schema for the internal data structure
    • schemas/rule_converters.ts provides functions that convert between the API structure and the internal structure
      • These functions also statically check that the API and internal structures are in sync
    • the internal schema can also be used to validate the rule SO we retrieve during rule execution
      • ^ also provides better clarity about which fields are available for each rule type during execution
  • Reduces number of custom validation functions to write and maintain (create_rules_type_dependents.ts and update_rules_type_dependents.ts) by using stricter schema

Mixed:

  • defaults are no longer applied as part of the io-ts decode process. This makes the schemas simpler, but we still have to handle applying the defaults.

Disadvantages:

  • Still has problems with type specific patch route
    • Since type is not required when patching we can't easily distinguish between patch requests for different types of rules. This makes it harder to write a function like typeSpecificSnakeToCamel that works for patch requests.
  • indecipherable error messages if type is not specified in the rule - this is due to the union of the different type dependent rule schemas

Checklist

Delete any items that are not applicable to this PR.

For maintainers

@marshallmain marshallmain marked this pull request as ready for review November 10, 2020 17:47
@marshallmain marshallmain requested review from a team as code owners November 10, 2020 17:47
@marshallmain marshallmain added release_note:skip Skip the PR/issue when compiling release notes v7.11 Feature:Detection Rules Security Solution rules and Detection Engine Team:SIEM v8.0.0 labels Nov 10, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/siem (Team:SIEM)

import { createRuleValidateTypeDependents } from './create_rules_type_dependents';

describe('create_rules_type_dependents', () => {
test('saved_id is required when type is saved_query and will not validate without out', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are now covered by static type checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure these are covered in schema tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 added tests covering the same behavior in rule_schemas.test.ts

expect(message.schema).toEqual({});
});

test('[rule_id, description, from, to, name] does not validate', () => {
const payload: Partial<CreateRulesSchema> = {
const payload = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would try to keep to utilize as many strong types as we can as it's better to fail during static analysis and/or in the IDE than during test runtime. This also gives hints to any developer coming into the file when they're adding a new test by copying/pasting a previous test and modifying they will have the type for it.

For example, with a Partial<CreateRulesSchema> if you misspell something when writing a test like so below you will give faster positive feedback from your IDE rather than waiting for the jest test to run. Also when people unfamiliar with our code base refactor they will get typescript compiler errors first before jest errors and those are typically easier to figure out instead of jest errors.

Screen Shot 2020-11-11 at 2 46 18 PM

@@ -231,7 +191,7 @@ describe('create rules schema', () => {
});

test('[rule_id, description, from, to, name, severity, type, query, index, interval] does validate', () => {
const payload: CreateRulesSchema = {
const payload = {
Copy link
Contributor

@FrankHassanabad FrankHassanabad Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Same advice with having strong types applies to the payload/expected which is why we break them out and add types as it's easier to find errors, type mismatches, and get positive feedback directly from the IDE rather than rely on Jest.

Typically even in other areas of the codebase when I do refactors if I am working in a Jest test that is using an expect payload without a strong type but I can add one, I usually do add it during the refactor. Later it makes it that much easier to just run type checks, fix the problems all at once and then run the larger jest tests.

That and fixing type issues in my IDE is positive feedback when types/keys change on shapes of JS and I can fix all of the type issues at once rather than running the test, inspecting the output, and then fixing things that TypeScript could have let me know about in the IDE.

@@ -656,24 +421,20 @@ describe('create rules schema', () => {
});

test('saved_query type can have filters with it', () => {
const payload: CreateRulesSchema = {
...getCreateRulesSchemaMock(),
const payload: SavedQueryCreateSchema = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on keeping the type here. With a spread and an additional value, we have to explicitly add the type to ensure stricter checks from TS. Awesome!

});

test('filters cannot be a string', () => {
const payload: Omit<CreateRulesSchema, 'filters'> & { filters: string } = {
const payload = {
...getCreateRulesSchemaMock(),
filters: 'some string',
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok to do to weaken the types for a test if the Omit feels unergonomic and unwieldily to use to keep the type strict. You should get the majority of type safety from the function getCreateRulesSchemaMock still.

If you hover between the two types though of this payload and the getCreateRulesSchemaMock() you should notice the interesting thing which is TypeScript during a spread is changing the types slightly.

In regular (outside of tests) we have had some odd behaviors from spreads where TypeScript can change the types slightly or have an ambiguity situation where once we added the type to where we initiated the spread the errors/issues went away.

You might know all of this, just footnotes in case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ I added back most of the stronger types where the payload is intended to validate correctly or some fields are simply missing. I figure for cases where validation should fail because a field is the wrong type there's not as much benefit so I left them off for convenience.

});
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, if you want to cut down on some the dotted noise you can always do ...just... a bit of destructoring like so:

async (context, request, response) => {
  const { body } = request;

What's interesting is we have been following guidelines from Kibana here:
https://github.com/elastic/kibana/blob/master/STYLEGUIDE.md#use-object-destructuring

But I think as the rules grew in size, no one wants to destructor a large amount of data. I think this is all very readable as is, fwiw 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I ended up pulling the conversion out into a separate function so it was easy to just pass request.body into that function, best of both worlds.

if (isEmpty(removedNullValues)) {
return currentVersion;
} else {
return currentVersion + 1;
}
};

export const removeUndefined = (obj: object) => {
return pickBy((value: unknown) => value != null, obj);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the technical term would be removeNil as this would be removed null and undefined values. The original was removedNullValues but that wasn't accurate as well.

At least nil is the terminology that lodash uses:
https://lodash.com/docs/4.17.15#isNil

No changes asked, just pointing it out.

};
}
default: {
return assertUnreachable(params);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took out one of these statements and the exhaustive switch statement looks good and like it is working 👍

if (existingRule.enabled && enabled === false) {
await alertsClient.disable({ id: existingRule.id });
} else if (!existingRule.enabled && enabled === true) {
await alertsClient.enable({ id: existingRule.id });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free if you want to directly use the booleans:

  if (existingRule.enabled && !enabled) {
    await alertsClient.disable({ id: existingRule.id });
  } else if (!existingRule.enabled && enabled) {
    await alertsClient.enable({ id: existingRule.id });

Instead of the explicit === true or === false, those are probably just older bad habits from too much time with JS types that could be undefined or null, but I think these booleans are ok to use normally if you think that reads better.

No changes requested, just pointing out parts of code myself or others wrote that might read odd.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it helps, the alerting framework team helped us with this part as we have to put the enabling/disabling of rules after we do any updates first because once we enable/disable a rule it will invalidate API keys and can cause a "false run" to happen where the first run has a bad API key.

So right in this section when we call this line:

await alertsClient.enable({ id: existingRule.id });

That is going to start the rule and invalidate any previous API keys it might have had. So just a FYI for you if you haven't figured that part out yet looking through here.

@@ -4,179 +4,94 @@
* you may not use this file except in compliance with the Elastic License.
*/

/* eslint-disable complexity */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optionally you can bump this up to say 30 like so:

/* eslint complexity: ["error", 30]*/

If you don't want it to go higher than 30.

You could break up the params section here to another function and return the params separate if you wanted to reduce the complexity:

{
      author: ruleUpdate.author ?? [],
      buildingBlockType: ruleUpdate.building_block_type,
      description: ruleUpdate.description,
      ruleId: existingRule.params.ruleId,
      falsePositives: ruleUpdate.false_positives ?? [],
      from: ruleUpdate.from ?? 'now-6m',
      // Unlike the create route, immutable comes from the existing rule here
      immutable: existingRule.params.immutable,
      license: ruleUpdate.license,
      outputIndex: ruleUpdate.output_index ?? defaultOutputIndex,
      timelineId: ruleUpdate.timeline_id,
      timelineTitle: ruleUpdate.timeline_title,
      meta: ruleUpdate.meta,
      maxSignals: ruleUpdate.max_signals ?? DEFAULT_MAX_SIGNALS,
      riskScore: ruleUpdate.risk_score,
      riskScoreMapping: ruleUpdate.risk_score_mapping ?? [],
      ruleNameOverride: ruleUpdate.rule_name_override,
      severity: ruleUpdate.severity,
      severityMapping: ruleUpdate.severity_mapping ?? [],
      threat: ruleUpdate.threat ?? [],
      timestampOverride: ruleUpdate.timestamp_override,
      to: ruleUpdate.to ?? 'now',
      references: ruleUpdate.references ?? [],
      note: ruleUpdate.note,
      // Always use the version from the request if specified. If it isn't specified, leave immutable rules alone and
      // increment the version of mutable rules by 1.
      version:
        ruleUpdate.version ?? existingRule.params.immutable
          ? existingRule.params.version
          : existingRule.params.version + 1,
      exceptionsList: ruleUpdate.exceptions_list ?? [],
      ...typeSpecificParams,
    }

Overall it reads fine to me, so I'm ok with leaving it disabled for the file since there's only one function within the file.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
securitySolution 2068 2069 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 7.8MB 7.8MB +338.0B

Distributable file count

id before after diff
default 42777 42779 +2

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
securitySolution 243.6KB 245.0KB +1.5KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@spalger spalger added v7.11.0 and removed v7.11 labels Nov 12, 2020
Copy link
Contributor

@FrankHassanabad FrankHassanabad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked it out, tested it quite a bit with:

  • export rules
  • import rules
  • created rules manually and then through scripting
  • Toggled rules on and off
  • Manually edited a few rules (but not all rule types)

Reviewed the code, saw nothing but just some nits. Overall this looks to be a great improvement and cleanup so 👍 , upwards and onwards.

@marshallmain marshallmain merged commit 4dd25fa into elastic:master Nov 13, 2020
@marshallmain marshallmain deleted the new-rule-schemas branch November 13, 2020 02:22
marshallmain added a commit to marshallmain/kibana that referenced this pull request Nov 13, 2020
…mas (elastic#82462)

* Adds framework for replacing API schemas

* Update integration tests with new schema

* Fix response type on createRule helper

* Add unit tests for new rule schema, add defaults for some array fields, clean up API schema definitions

* Naming updates and linting fixes

* Replace create_rules_bulk_schema and refactor route

* Convert update_rules_route to new schema

* Fix missing name error

* Fix more tests

* Fix import

* Update patch route with internal schema validation

* Reorganize new schema as drop-in replacement for create_rules_schema

* Replace updateRulesSchema with new version

* Cleanup - remove references to specific files within request folder

* Fix imports

* Fix tests

* Allow a few more fields to be undefined in internal schema

* Add static types back to test payloads, add more tests, add NonEmptyArray type builder

* Pull defaults into reusable function
marshallmain added a commit that referenced this pull request Nov 13, 2020
…mas (#82462) (#83367)

* Adds framework for replacing API schemas

* Update integration tests with new schema

* Fix response type on createRule helper

* Add unit tests for new rule schema, add defaults for some array fields, clean up API schema definitions

* Naming updates and linting fixes

* Replace create_rules_bulk_schema and refactor route

* Convert update_rules_route to new schema

* Fix missing name error

* Fix more tests

* Fix import

* Update patch route with internal schema validation

* Reorganize new schema as drop-in replacement for create_rules_schema

* Replace updateRulesSchema with new version

* Cleanup - remove references to specific files within request folder

* Fix imports

* Fix tests

* Allow a few more fields to be undefined in internal schema

* Add static types back to test payloads, add more tests, add NonEmptyArray type builder

* Pull defaults into reusable function
gmmorris added a commit to gmmorris/kibana that referenced this pull request Nov 16, 2020
* master:
  [Security Solution][Detections] Adds framework for replacing API schemas (elastic#82462)
  [Enterprise Search] Share Loading component (elastic#83246)
  Consolidates Jest configuration files and scripts (elastic#82671)
  APM header changes (elastic#82870)
  [Security Solutions] Adds a default for indicator match custom query of *:* (elastic#81727)
  [Security Solution] Note 10k object paging limit on Endpoint list (elastic#82889)
  [packerCache] fix gulp usage, don't archive node_modules (elastic#83327)
  Upgrade Node.js to version 12 (elastic#61587)
  [Actions] Removing placeholders and updating validation messages on connector forms (elastic#82734)
  [Fleet] Rename ingest_manager_api_integration tests fleet_api_integration (elastic#83011)
  [DOCS] Updates Discover docs (elastic#82773)
  [ML] Data frame analytics: Adds map view (elastic#81666)
  enables actions scoped within the stack to register at Basic license (elastic#82931)
@MindyRS MindyRS added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Sep 22, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Detection Rules Security Solution rules and Detection Engine release_note:skip Skip the PR/issue when compiling release notes Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:SIEM v7.11.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants